ABSTRACT

We present the sixth catalog of Kepler candidate planets based on nearly 4 years of high precision 
photometry. This catalog builds on the legacy of previous catalogs released by the Kepler project 
and includes 1493 new Kepler Objects of Interest (KOIs) of which 554 are planet candidates, and 131 
of these candidates have best fit radii < 1.5 R⊕. This brings the total number of KOIs and planet 
candidates to 7305 and 4173 respectively. We suspect that many of these new candidates at the 
low signal-to-noise limit may be false alarms created by instrumental noise, and discuss our efforts 
to identify such objects. We re-evaluate all previously published KOIs with orbital periods of > 50 
days to provide a consistently vetted sample that can be used to improve planet occurrence rate 
calculations. We discuss the performance of our planet detection algorithms, and the consistency of 
our vetting products. The full catalog is publicly available at the NASA Exoplanet Archive. 
1. INTRODUCTION

The NASA Kepler mission (Koch et al. 2010) offers an 
unprecedented view of the time-domain Universe. Following 
its launch in 2009, Kepler obtained nearly uninterrupted 
observations of a single large field (over 100 
square degrees) centered 13.5° above the Galactic plane. 
These long duration, near continuous observations are 
needed to achieve the mission goal of detecting the transit 
of Earth size planets in the habitable zone of Sun-like 
stars. 

Kepler data is a uniquely valuable observational resource. 
The volume of data has encouraged the development 
of new techniques to detrend time series data 
(Coughlin et al. 2012; Roberts et al. 2013; Ambikasaran 
et al. 2014; Handberg et al. 2014), and facilitates science 
not directly related to the core Kepler science of 
finding planets. Highlights include asteroseismology of 
main-sequence stars and red giants (e.g., Chaplin et al. 
2014; Bedding et al. 2011), classical pulsators (e.g., Szabo 
et al. 2010), eclipsing binaries (e.g., Conroy et al. 2014), 
ages of clusters (Meibom et al. 2011), and active galactic 
nuclei (Mushotzky et al. 2011). 

BAERI/NASA Ames Research Center, Moffett Field, CA 
94035, USA 

Institut für Astrophysik, Universität Göttingen, 
Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany 

Northwestern University Department of Physics and Astronomy, 
Center for Interdisciplinary Research and Exploration in 
Astronomy (CIERA), 2145 Sheridan Road, Evanston, IL 60208, USA 

Sagan Fellow 

Sydney Institute for Astronomy (SIfA), School of Physics, 
University of Sydney, NSW 2006, Australia 

Dept. Astronomy, University of California, Berkeley, CA 
94720 

Mullally et al. 

The project has produced several catalogs of planet 
candidates (Borucki et al. 2011a,b; Batalha et al. 2013; 
Burke et al. 2014; Rowe et al. 2015). These catalogs provide 
timely updates of new, interesting individual objects 
suitable for ground-based follow-up, and the necessary 
data for planet occurrence rate calculations as a function 
of radius, orbital period and other properties (e.g., 
Youdin 2011; Catanzarite et al. 2011; Howard et al. 2012; 
Morton 2012; Dressing et al. 2013; Dong et al. 2013; Mulders 
et al. 2014). Independent groups have constructed 
their own catalogs (Petigura et al. 2013; Schmitt et al. 
2014; Sanchis-Ojeda et al. 2014), and derive their own 
occurrence rates (Foreman-Mackey et al. 2014). The comparison 
of these independent efforts will help identify and 
correct the inevitable insufficiencies that any approach 
will suffer for a data set of this size and richness. The 
true false positive rates of our catalogs are not known 
(e.g., Dressing et al. 2014; Lissauer et al. 2014; Rowe 
et al. 2014; Fressin et al. 2013; Morton et al. 2011), and 
this must be addressed when deriving occurrence rates. 

Observations of the Kepler field ended on 2013 May 
11 with the failure of the second of four on-board 
reaction wheels. The spacecraft could no longer maintain 
the required pointing precision in the original field and 
was re-purposed for an ecliptic plane mission (K2; Howell 
et al. 2014). 

The catalog presented here is the first based on an 
essentially complete data set on the original Kepler field 
(one more month of data was collected after work on 
this catalog was started, but does not noticeably add to 
the timespan of observations). Future catalogs will include 
the complete data set, but will focus on finding 
new planets through improvement in our detection 
algorithms instead of relying on the longer baseline. With 4 
years of data, we become sensitive for the first time to 
transit depths ~ 200 ppm at periods of 200 days or more, 
the parameter space of Earth analogs around Sun-like 
stars, and an important goal of the mission. 

In this article, we report on our efforts to detect and vet 
new planet candidates with 46 months of data. We also 
re-vet known candidates from older catalogs with periods 
>50 days to provide a consistently vetted sample to aid 
occurrence rate calculations. This period cut eliminates 
most of the false positives due to eclipsing binaries. We 
add 1493 new objects of interest to the previous catalogs, 
of which 554 we deem to be valid planet candidates. We 
discuss the reliability of these candidates in § 9.1. In an 
effort to reduce the subjectivity inherent in the 
human-dominated false positive identification techniques 
previously used, we begin to apply objective, rule-based tests 
to improve the uniformity of our sample, a process that 
will mature in later catalogs. We make our catalog publicly 
available at the NASA Exoplanet Archive (see § 8). 
Readers whose primary interest is in making use of this 
table may wish to skip to § 7 for a discussion of the 
features of the catalog they should be aware of. 

2. INPUT DATA 

The Q1-Q16 catalog is based on analysis of 16 quarters 
of data obtained by the Kepler spacecraft from 2009 
May 13 to 2013 April 8. A total of 197,320 targets were 
observed, with 151,763 observed for at least 13 quarters 
(Tenenbaum et al. 2013). The processed lightcurves and 
calibrated pixel images from which the lightcurves are 
derived are available at the Mikulski Archive for Space 
Telescopes (MAST). Note that the lightcurves available 
at MAST (Data Release 24) come from a newer version 
of the pipeline than was used in this analysis, but 
the differences are not substantial. Commonly observed 
artifacts of the data are described in the Data Characteristics 
Handbook (Christiansen et al. 2013a), and important 
features of the quarterly data are described in the 
Data Release Notes for that quarter (Thompson et al. 
2014, 2013). Both the Characteristics Handbook and 
the Release Notes can be found on the MAST webpage. 

3. PIPELINE PROCESSING 

This data set was reduced and analyzed by version 9.1 
of the Science Operations Center pipeline (Jenkins et al. 
2010). The components of the pipeline are referred to 
by short 2-3 letter names. Calibrated pixel fluxes were 
generated by the CAL module (Quintana et al. 2010), 
lightcurves were extracted by the PA module (Twicken 
et al. 2010), and systematic error removal was performed 
by the PDC module (Stumpe et al. 2012; Smith et al. 
2012; Stumpe et al. 2014). The generation of lightcurves 
does not significantly distort the transit signals. 
Christiansen et al. (2013b) injected simulated transits into 
Kepler data and found that fewer than 1% of single transits 
were suppressed by these three pipeline components, while the 
signal to noise ratio of the remainder was preserved to 
better than 0.3% on average. 

Potential transits are identified and characterized by 
the TPS (Tenenbaum et al. 2014) and DV (Wu et al. 
2010) modules. These events are known as Threshold 
Crossing Events, or TCEs. A list of the 16,285 TCEs 
found in this pipeline run is available at the NASA 
Exoplanet Archive (§ 8). 

While an estimate of the number of transit signals not 
detected by TPS (the false negative rate) is a topic of 
active research (Christiansen et al. 2013b), the fraction 
of detections not due to transits (the false alarm rate) is 
high. Of 16,285 TCEs found by TPS in this processing, 
detailed examination determined only 6351 were 
transit-like. This is intentional: the science cost of a false alarm 
is very much less than that of a false negative. The core 
of this article details our efforts to winnow valid planet 
candidates from the long list of TCEs. 

3.1. Characteristics of the TCE sample 

Testing of the TPS module by injecting simulated transits 
suggests that the recovery rate is close to 100% for 
TCEs with multiple event statistic (MES) > 16, but falls 
to 0 by MES of 6 (Christiansen et al. 2015 in prep), 
consistent with the estimate of Fressin et al. (2013). MES 
is defined in detail in Jenkins et al. (2002), but is 
equivalent to the signal-to-noise ratio of the transit detection 
in the folded lightcurve. A MES > 7.1 is required by 
the pipeline for detection of a TCE. TPS is known to 
under-detect short period (< 10 day) planets. Short 
period planets are often mistaken for coherent stellar 
variability, which is removed by TPS before searching. 
Removing this variability increases the yield of longer 
period, more interesting, planets that would otherwise be 
missed. 

http://archive.stsci.edu/kepler 
Figure 1. Histogram of number of TCEs found by the pipeline as 
a function of log10 Period (red histogram). The sharp spike near 
log10 Period = 2.6 is due to rolling band noise, and the broader 
peak is due to other sources of three-transit false alarms (see § 3.1). 
The cuts detailed in § 4.2 reduce the sample to the blue histogram, 
while the manual triage process reduces it to the yellow histogram. 
No cuts were made for TCEs with period of 50-300 days (between 
the vertical dashed lines), hence the sudden drop in the number of 
TCEs requiring triage (blue histogram) at 300 days. 

Two different kinds of instrumental artifacts dominate 
the TCE population at long periods (> 200 days). Edge 
effects around gaps, stellar flares, and short-timescale 
systematics can cause transit-like signals in the whitened 
lightcurve that are detected at low significance. These 
can combine to produce a significant detection that may 
not be rejected by the various tests TPS uses to identify 
false alarms (Seader et al. 2013; Tenenbaum et al. 
2013). The probability of finding n equally spaced events 
of roughly equal significance drops rapidly with increasing 
n (or decreasing period), so these systematics are 
predominantly an issue for TCEs with three events, or 
periods of hundreds of days. In fact, we believe the number 
of false alarm TCEs with three detected transits is 
many times greater than the number of planet candidates 
detected with three transits. This issue, first noted in 
Tenenbaum et al. (2014), causes the broad peak in the 
TCE histogram shown in Figure 1. We discuss our 
mitigation efforts in § 4.2. 

The second source of systematics creates TCEs with 
periods close to the orbital period of the spacecraft. The 
spacecraft rotates around its boresight 4 times per orbit 
to keep the solar panels pointed at the Sun. Each star is 
observed by up to 4 different CCDs, each with their own 
noise properties. This causes an abundance of artifacts 
with periods similar to the orbital period (372 days). In 
particular, a small number of channels suffer from 
unstable readout amplifiers that cause background levels 
to vary in time and CCD position (Caldwell et al. 2010). 
These “rolling bands” cause small amplitude variations 
in the background level underneath stars that are flagged 
by TPS as TCEs. 

Finally, poorly corrected systematic signals in the 
lightcurves can also produce false alarm TCEs, although 
TPS has considerably improved its ability to identify and 
eliminate these problems. We discuss our efforts to cull 
these various false alarms from our catalog in the next 
two sections. 

4. TRIAGE 


Distilling a population of planet candidates from the 
list of TCEs is the focus of the work described in this 
article, and is performed by the Threshold Crossing Event 
Review Team (TCERT). Following Burke et al. (2014), 
we take a two-tiered approach. The first tier is to quickly 
eliminate the bulk of TCEs that can be readily identified 
as false alarms (defined here as TCEs not produced by 
the transit of one astrophysical body across the face of 
another). This tier, called triage, is discussed in this 
section. Triage proceeds in three steps: federation, where 
TCEs associated with known KOIs are found; rejection 
of false alarms by rule; and visual examination of the 
remainder. TCEs that pass triage are given a KOI number 
and pass to the second tier, vetting, where we apply 
further scrutiny to potential false alarms, and identify 
false positives, or transit events caused by binary stars. 
Vetting is described in the next section. 

4.1. Federation 

The federation process involves identifying TCEs that 
correspond to pre-existing KOIs, based on whether the two 
predict similar in-transit cadences. Periods and epochs 
that agree within the uncertainties may still predict 
different sets of in-transit cadences over the 4 years of 
Kepler data. Instead of simply comparing the ephemerides, 
we test whether the in-transit cadences predicted by the 
two ephemerides agree. 

For each KOI, and for each observed cadence, we set a 
boolean value, $y_{i,\mathrm{KOI}}$, to true if the midpoint of the transit 
falls in that cadence. We also calculate the number of 
transits of the KOI that are predicted to occur during 
the observations, $N_{\mathrm{KOI}}$. 

Next, we generate another boolean vector, $y_{i,\mathrm{TCE}}$, of 
transit events for each TCE detected around the same 
target where the period of the TCE is within a factor of 
three of the period of the KOI. Unlike the single cadence 
marking the KOI transit events, this vector is set to true 
for all cadences that occur within the transit duration of 
the mid-transit time as estimated by the TCE ephemeris. 
We also define $N_{\mathrm{TCE}}$, the number of transits of the TCE 
that occur during the observations. 

Finally, we measure the degree of correlation between 
the KOI and TCE transits, 

$$E = \frac{\sum_i y_{i,\mathrm{KOI}} \times y_{i,\mathrm{TCE}}}{N_{\mathrm{TCE}}} - \frac{\sum_i y_{i,\mathrm{KOI}} \times \overline{y_{i,\mathrm{TCE}}}}{N_{\mathrm{KOI}}} \qquad (1)$$

where the summation is over the observed cadences and 
the overbar indicates the logical NOT operation (i.e., the vector 
indicating out-of-transit cadences according to the TCE 
ephemeris). 

The first term of Equation 1 measures the degree to 
which the KOI transit events align with the TCE transit 
events, and the second term penalizes E for having KOI 
transit events align with data out-of-transit according 
to the TCE. The ephemeris correlation statistic ranges 
over $-1 \le E \le 1$, where a value of 1 indicates that 
all predicted KOI transit events overlap within a transit 
duration of the TCE ephemeris, and $-1$ indicates no 
correlation. We require E > 0.8, indicating that at least 
90% of the KOI transit events agree with the TCE, to 
federate a TCE to a KOI. This high level of federation 
ensures that the cadences involved in calculating the vetting 
metrics in DV correspond to a similar set of cadences 
being used for identification and analysis of the KOI from 
previous pipeline runs. 
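
The federation statistic can be illustrated with a short Python sketch. It is not the pipeline code: the function names, the (period, epoch, duration) tuple layout, and the choice to flag TCE cadences lying within one full transit duration of mid-transit are assumptions made for illustration. The two terms are normalized by the number of TCE and KOI transits respectively, following Equation 1.

```python
import numpy as np

def mid_transit_times(period, epoch, t_start, t_end):
    """Mid-transit times of a linear ephemeris inside [t_start, t_end]."""
    n_first = int(np.ceil((t_start - epoch) / period))
    n_last = int(np.floor((t_end - epoch) / period))
    return epoch + period * np.arange(n_first, n_last + 1)

def ephemeris_correlation(cadence_times, koi, tce):
    """Ephemeris correlation E for one KOI/TCE pair (illustrative sketch).

    koi and tce are hypothetical (period, epoch, duration) tuples in days;
    cadence_times must be sorted and contain at least two cadences.
    """
    t = np.asarray(cadence_times, dtype=float)
    koi_mid = mid_transit_times(koi[0], koi[1], t[0], t[-1])
    tce_mid = mid_transit_times(tce[0], tce[1], t[0], t[-1])
    if koi_mid.size == 0 or tce_mid.size == 0:
        raise ValueError("no predicted transits inside the observations")

    # y_KOI: flag only the cadence closest to each KOI mid-transit.
    hi = np.clip(np.searchsorted(t, koi_mid), 1, t.size - 1)
    nearest = np.where(koi_mid - t[hi - 1] <= t[hi] - koi_mid, hi - 1, hi)
    y_koi = np.zeros(t.size, dtype=bool)
    y_koi[nearest] = True

    # y_TCE: flag every cadence within one duration of a TCE mid-transit
    # (assumed reading of "within the transit duration").
    period, epoch, duration = tce
    offset = (t - epoch + 0.5 * period) % period - 0.5 * period
    y_tce = np.abs(offset) <= duration

    aligned = np.count_nonzero(y_koi & y_tce)
    misaligned = np.count_nonzero(y_koi & ~y_tce)
    return aligned / tce_mid.size - misaligned / koi_mid.size
```

Under these assumptions a TCE would be federated to a KOI when the returned value exceeds 0.8; identical ephemerides give 1 and ephemerides offset by half a period give −1.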

Transits from planets in multi-planet systems may not 
be equally spaced in time due to gravitational perturbation 
of the orbit. In extreme cases, TPS will not find 
such events. In less extreme cases, TPS may generate a 
TCE, but that TCE will fail federation. 

4.1.1. Recoverability of KOIs 

A good indication of the performance of TPS in a given 
region of parameter space is its ability to rediscover KOIs 
found with previous versions of the pipeline and smaller 
data sets. In the top left panel of Figure 2, we show 
the distribution of KOIs discovered in previous catalogs 
(Borucki et al. 2011a,b; Batalha et al. 2013; Burke et al. 
2014; Rowe et al. 2015), and also found in this run of TPS. 
For clarity, we do not show individual KOIs, but the 
density of KOIs in equally-spaced logarithmic bins of transit 
period and depth. The panel on the right shows the 
distribution of KOIs listed in previous catalogs that were 
not rediscovered. These un-recovered KOIs are known 
as “orphans”. The bottom left panel shows the 
distribution of newly discovered KOIs. 

As expected, the new KOIs cluster at longer periods 
and shallower depths than the federated KOIs. There is 
also an additional population of deep, short period 
transits. Previous catalogs excluded all eclipsing binary (EB) 
candidates identified by Prsa et al. (2011) and Slawson 
et al. (2011). This EB candidate list has some overlap 
with planet candidates, so for the Q1-Q16 pipeline 
run we only excluded EB candidates with a readily 
distinguishable secondary eclipse. Thus, the Q1-Q16 search 
included many transit-like systems that are likely short 
period EBs with deep primaries but no detectable 
secondary eclipse. 

The orphan KOIs divide into two main clusters, one 
clustered around periods of 1 day, and another tightly 
distributed at a period near 372 days. The short period 
cluster is caused by an over-aggressive pruning of false 
alarms in TPS, which removed some bona-fide 
short-period transits. This problem will be fixed in a 
future version of the pipeline (Seader 2015 in prep). The 
fact that these ~ 1 day period KOIs were not recovered 
should not be used as a reason to doubt their validity. 

The long period cluster (at 300-400 days) is dominated 
by false alarms due to rolling band noise discussed 
in § 3.1. These artifacts of correlated noise were 
mis-identified as KOIs by Rowe et al. (2015), and were not 
re-detected with the addition of more data. 

If a KOI is not recovered in this pipeline run, it 
should be examined carefully before committing further 
resources to its study. However, non-recovery is not 
sufficient reason to label it as a false alarm. A KOI may 
be missed because of a change in a pipeline algorithm, 
or a change in the noise properties of the star, reducing 
the signal to noise of the transit below our threshold for 
detection. If the KOI is part of a multiple planet system, 
it may be exhibiting transit timing variations that can 
make the KOI difficult for TPS to recover. 

4.2. Rejection of False Alarms by Rule 

Comparing the distribution of TCE periods from the 
Q1-Q12 and Q1-Q16 results, there is a large increase in 
the number of long-period TCEs. This increase is far 
more than would be expected from analyzing an additional 
four quarters of data. Tenenbaum et al. (2014) 
found the long-period population was dominated by 
obviously non-astrophysical false alarms. 

Applying the cuts suggested in Tenenbaum et al. 
(2014) to the Q1-Q16 TCEs reduced the number of TCEs 
to be vetted from 16,285 to 7,959. Of particular 
importance, it reduced the number of TCEs at periods greater 
than 300 days from 6,073 to 1,243, thus removing a large 
number of the long-period false alarms. We show the 
TCE population before and after the cuts in Figure 1. 
To facilitate the occurrence rate calculations of Burke 
et al. (2015 in prep), we did not apply the cuts to TCEs 
with periods 50-300 days. 
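
The effect of the period exemption can be sketched in a few lines of Python. The cuts themselves are defined in Tenenbaum et al. (2014) and are not reproduced here; `is_rule_false_alarm` is a hypothetical callable standing in for them, the dictionary layout of a TCE record is assumed, and treating the 50 and 300 day boundaries as inclusive is also an assumption.

```python
def select_for_triage(tces, is_rule_false_alarm, exempt_days=(50.0, 300.0)):
    """Keep TCEs that survive the rule-based cuts or sit in the exempt period range.

    tces: iterable of dicts with at least a "period" key in days (assumed layout).
    is_rule_false_alarm: hypothetical callable returning True for TCEs that
        the rule-based cuts would reject.
    """
    low, high = exempt_days
    survivors = []
    for tce in tces:
        exempt = low <= tce["period"] <= high  # cuts are never applied here
        if exempt or not is_rule_false_alarm(tce):
            survivors.append(tce)
    return survivors
```

Because TCEs between 50 and 300 days bypass the rules entirely, every one of them reaches manual inspection, which is why the number of TCEs requiring triage drops abruptly at 300 days in Figure 1.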

4.3. Manual inspection 

After federation, and the cuts mentioned above, we 
are left with 7,959 TCEs. We subject these TCEs to a 
visual inspection to remove obvious false alarms. A 
minimum of two people look at a plot of the folded lightcurve 
contained in the summary report produced by DV (and 
available at the Exoplanet Archive) and identify the TCE 
as either a false alarm or a likely transit. In this, and all 
subsequent steps, we apply the principle of “innocent 
until proven guilty”; a TCE is passed as a KOI unless there 
is evidence beyond all reasonable doubt that it is not. 
When the two people disagree on the status, a third 
person resolves the dispute. Transits likely due to EBs are 
passed at this stage. 

4.3.1. Post Triage 

A TCE that passes triage is only assigned a KOI number 
if it stands up to some additional scrutiny. Like 
any detrending technique, the whitening algorithm used 
by TPS can exaggerate the depth of some small signals, 
causing unwanted detections. We therefore require that 
the transit signal maintains its integrity after an 
alternative detrending algorithm is applied to the PDC light 
curve. We employ the non-parametric penalized least 
squares method from Garcia (2010) which includes only 
the out-of-transit points when computing the filter. 

Triage-passing TCEs that show evidence for a transit 
signal in the alternative detrended light curve are 
elevated to KOI status and passed on to the vetting stage. 
For ambiguous cases, we visually examine the individual 
transits and check that they are unique relative to other 
nearby events, uncorrupted by systematic effects, and 
are consistent in depth and duration. We elevate 
triage-passing TCEs to KOI status unless a clear problem is 
identified. 

In total, 1493 new KOI assignments result from this 
Q1-Q16 pipeline analysis. Of these, 1330 are the first KOI 
around their star, and 163 are new members of multi-KOI 
systems. 

5. VETTING 

After triage, we subjected the KOIs to a more rigorous 
vetting to identify false positives: targets that show 
a transit-like event which is not due to the transit of a 
planet around the target star. Vetting comprises three 
independent steps. We summarize these steps below, but 
they are similar to those used by Batalha et al. (2013), 
Burke et al. (2014), and Rowe et al. (2015). The products 
we use for vetting, and a detailed manual for their use, 
are available at the NASA Exoplanet Archive (see § 8). 

Figure 2. (a): Distribution of KOIs previously discovered in earlier catalogs, and recovered in this catalog. We refer to these as “federated” 
KOIs. The bins are equally spaced in the logarithm of orbital period and transit depth. Darker colors indicate a greater density of KOIs 
in that bin. Both planet candidates and false positives are included in this plot. (b): Same as (a), but for previously known KOIs not 
recovered in Q1-Q16. These KOIs are called orphans. (c): KOIs newly discovered in this catalog. For the first time, the KOI population 
extends to transit depths below 200 ppm for periods longer than 300 days. 

The large number of KOIs makes it difficult to vet 
every case. Instead we looked at two overlapping 
populations: newly identified KOIs, and previously known 
KOIs with periods greater than 50 days. The first set 
provides new targets for follow-up and individual study. 
The second set provides a uniformly vetted set of planet 
candidates suitable for occurrence rate studies. Where a 
previously known KOI with period greater than 50 days 
is part of a multi-KOI system (i.e., multiple KOIs found 
around the same star), we vetted every KOI in that 
system. 

5.1. Flux Vetting 

Flux vetting seeks to further distill out false alarms 
that survived triage, and to detect eclipsing binaries by 
the presence of a secondary eclipse. The transits are 
visually inspected as individual events, and in folded 
lightcurves on a per quarter basis, per season basis, and 
across the full timeseries. KOIs that do not look 
significant compared to the noise in the data are identified and 
marked as false alarms. 

If the primary transit looks real (i.e., statistically 
significant and not caused by data artifacts) we proceed to 
search for evidence of a secondary. The folded lightcurve 
is convolved with the best fit model transit, and the 
result is inspected for evidence of a second event. Outliers 
and noisy data can often confuse this metric, so 
manual inspection is required for this step. Our flux vetting 
procedure is described in detail in Rowe et al. (2015). 
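
The secondary search can be illustrated with a minimal Python sketch. It is not the vetting code described in Rowe et al. (2015): a unit-area box stands in for the best fit transit model, the phase binning and the width of the excluded region around the primary are arbitrary choices, and the function name and arguments are hypothetical.

```python
import numpy as np

def search_secondary(time, flux, period, epoch, duration, n_bins=200):
    """Return (phase, depth) of the strongest box-shaped dip away from phase 0."""
    phase = ((np.asarray(time) - epoch) / period) % 1.0
    resid = np.asarray(flux) - np.median(flux)

    # Bin the folded lightcurve in phase (empty bins are set to zero).
    idx = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=resid, minlength=n_bins)
    binned = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

    # Unit-area box standing in for the best fit transit model (assumption).
    width = max(1, int(round(duration / period * n_bins)))
    template = np.full(width, -1.0 / width)

    # Circular correlation: pad with the first bins so windows wrap in phase.
    padded = np.concatenate([binned, binned[:width - 1]])
    response = np.correlate(padded, template, mode="valid")

    # Ignore windows centred within two durations of the primary transit.
    centre = ((np.arange(n_bins) + 0.5 * width) / n_bins) % 1.0
    near_primary = np.minimum(centre, 1.0 - centre) < 2.0 * duration / period
    response = np.where(near_primary, -np.inf, response)

    best = int(np.argmax(response))
    return centre[best], response[best]
```

The returned depth is the mean flux deficit inside the best window. As the text notes, outliers and noise can produce spurious peaks in such a response, so a result like this is a prompt for inspection rather than a disposition.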

5.2. Centroid Vetting 

Centroid vetting looks for evidence that the source of 
a transit is inconsistent with the location of the target 
star on the sky. It proceeds by computing difference 
images, then comparing the flux in- and out-of-transit on 
a per pixel basis for each quarter. Bryson et al. (2013) 
describes the centroid vetting in considerable detail, and 
the approach has remained largely unchanged. Transits 
from resolved sources within the mask of collected pixels 
can be identified by visual inspection, while transits that 
occur on stars unresolved from the target are identified 
by measuring the change in the location of the star’s 
centroid by fitting model pixel response functions (PRF; the 
Kepler PRF is described in Bryson et al. 2010a) to the 
difference images. What is different with this catalog is 
that we used a machine learning algorithm dubbed the 
centroid robovetter (to be described in Mullally et al. 
2015 in prep) to automate this procedure for a considerable 
number of KOIs. The robovetter is a rule-based 
system that seeks to emulate the behavior of TCERT 
using the same data products through a series of objective 
tests. 

DV produces per-quarter difference images of the 
per-pixel flux change during transit. The robovetter first 
identifies and rejects noisy difference images, then checks 
the remaining images for evidence that the transit source 
is associated with a resolved foreground or background 
source in the observed pixel mask. If the source of the 
transit is unresolved from the target star then the 
algorithm searches for shifts in the photometric centroids 
during transits using the per-quarter fits to the model 
PRF created by DV. In each step it implements a simple 
logic based on the rules built up by TCERT. Those 
KOIs where the automated technique reported a 
difficulty reaching a conclusion (mostly in the low SNR 
regime) were audited by two human vetters as before. 
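
The order of these tests can be sketched as a small decision function. This is only an illustration of the sequence described above, not the robovetter itself, which is described in Mullally et al. (2015 in prep): the per-quarter record layout, the minimum number of usable quarters, the majority rule for resolved sources, and the 3 sigma offset threshold are all hypothetical.

```python
import math

def centroid_disposition(quarters, min_good=3, max_sigma=3.0):
    """Rule-based centroid disposition for one KOI (illustrative only).

    quarters: list of dicts with hypothetical keys
        "noisy"            bool, difference image judged unusable
        "peak_on_neighbor" bool, transit signal sits on a resolved neighbor
        "dx", "dy"         PRF-fit centroid offset from the target, arcsec
    """
    # Step 1: discard noisy difference images.
    good = [q for q in quarters if not q["noisy"]]
    if len(good) < min_good:
        return "HUMAN_REVIEW"  # too little information to decide

    # Step 2: is the signal on a resolved source other than the target?
    on_neighbor = sum(1 for q in good if q["peak_on_neighbor"])
    if on_neighbor > len(good) / 2:
        return "FALSE_POSITIVE"

    # Step 3: test for a significant mean centroid offset across quarters.
    n = len(good)
    mean_dx = sum(q["dx"] for q in good) / n
    mean_dy = sum(q["dy"] for q in good) / n
    scatter = sum((q["dx"] - mean_dx) ** 2 + (q["dy"] - mean_dy) ** 2
                  for q in good) / (n - 1)
    std_err = math.sqrt(scatter / n)
    offset = math.hypot(mean_dx, mean_dy)
    if std_err == 0.0:
        return "PASS" if offset == 0.0 else "FALSE_POSITIVE"
    return "FALSE_POSITIVE" if offset / std_err > max_sigma else "PASS"
```

Returning an explicit review state mirrors the handling of low SNR cases in the text, where KOIs that the automated technique could not decide were passed to human vetters.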

Extensive testing on earlier catalogs shows that the 
robovetter agrees with human determinations at the 90% 
level. Most disagreement occurs for ephemeris match 
false positives (see § 5.3). Neither the robovetter nor 
manual vetting reliably identifies false positives when the 
source of the transit is outside the pixel mask. When 
those KOIs are removed, the agreement with TCERT 
rises to 98%. As a sanity check on the robovetter’s 
performance, we checked the results of the robovetter by hand 
for a small number (100) of KOIs, and found agreement 
at a similar level. 

For very high SNR cases, where the simplifying 
assumptions used in the transit model break down, the 
DV fit sometimes fails to converge and no difference 
images are produced. In these (≈ 80) cases, we plotted the 
folded per-pixel lightcurves to determine if the pattern 
of measured transit depth across pixels was consistent 
with the transit occurring on the target star. Although 
a much less precise test than the centroid fits, it still 
identified a small number of false positive KOIs. 

5.3. Ephemeris Matching 

Recently Coughlin et al. (2014), hereafter referred to 
as C14, identified over 100 false positive KOIs not found 
by other techniques by finding TCEs that share a common 
period and epoch with an EB or variable star. This 
ephemeris matching technique proved especially useful in 
identifying very low signal-to-noise false positives, which 
were unidentifiable via other techniques. 

For this catalog, we perform ephemeris matching using 
methods similar to those of C14. However, 
instead of comparing KOIs to EBs, we started with the 
TCE population, thus comparing TCEs to themselves, 
KOIs, and EBs. Specifically, we used the following catalog 
sources: 

1. The list of 16,285 Q1-Q16 TCEs from Tenenbaum 
et al. (2014). 

2. The list of 7,286 KOIs, ranging from KOI 1.01 to 
6251.01, available at the NASA Exoplanet Archive 
as of 2014 June 9. These include the new KOIs reported 
in this paper, as well as KOIs from previous 
Kepler catalogs. 

3. The Kepler eclipsing binary catalog list of 2,522 
“true” EBs found with Kepler data as of 2014 May 
14. The compilation of the catalog and derivation 
of the fit parameters are described in Kirk (2015 
in prep). Previous versions of this catalog are described 
in Slawson et al. (2011) and Prsa et al. 
(2011). 

4. J.M. Kreiner's up-to-date database of ephemerides 
of ground-based eclipsing binaries as of 2014 May 
14. Data compilation and parameter derivation are 
described in Kreiner (2004). 

5. Ground-based eclipsing binaries found via the TrES 
survey (Devor et al. 2008). 

6. The General Catalog of Variable Stars (GCVS, 
Samus et al. 2009) list of all known ground-based 
variable stars, published 2014 April. This catalog 
includes both eclipsing binaries and other periodic 
variable stars, such as pulsators. 

We use the same matching equations as C14 (their 
Eqns. 1-3). Given the larger number of matches when using 
the TCE population, and potentially higher number 
of coincidental matches at high significance, we checked 
and confirmed that the significance limits for matches 
used for C14 worked equally well to distinguish between 
real, statistically significant ephemeris matches, and random 
coincidences for our sample. 

We ended up with a final list of 960 TCEs with reliable 
enough matches to designate as false positives. With 
a longer baseline than C14, some more extreme period 
ratios are seen, with many at 10:1 or 20:1, and one set of 
TCEs confirmed to be as high as 45:1. The Q1-Q16 TCE 
population contains many false alarms at long periods 
(see § 4.2). As a result, we detected ephemeris matches 
as high as 700:1, (e.g., Kepler ID = 6948098, due to RR 
Lyrae). However, as these are obvious false alarms at 
the given TCE period, easily detected as false positives 
via other means, and time-consuming to separate from 
coincidence matches at such a high period ratio, we set 
our maximum reported period ratio at 50:1, and thus do 
not record these extreme period ratios (> 50:1) among 
our list of 960 false positives. 
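
The role of the period ratio limit can be illustrated with a deliberately simplified Python sketch. It does not implement the matching equations of C14, which weigh matches by statistical significance; the fixed tolerances and the function name below are assumptions chosen only to show how integer period ratios up to 50:1 and aligned epochs might be tested.

```python
def simple_ephemeris_match(period_a, epoch_a, period_b, epoch_b,
                           max_ratio=50, ratio_tol=1e-3, epoch_tol=0.02):
    """Return the integer period ratio if two ephemerides line up, else None.

    Periods and epochs are in days. ratio_tol is a fractional tolerance on
    the period ratio and epoch_tol a tolerance in cycles of the shorter
    period; both values are hypothetical.
    """
    p_short, p_long = sorted((period_a, period_b))
    ratio = p_long / p_short
    n = round(ratio)
    if n < 1 or n > max_ratio:
        return None  # ratios beyond the reporting limit are not recorded
    if abs(ratio - n) / n > ratio_tol:
        return None  # periods are not commensurate
    # Epochs must differ by a whole number of cycles of the shorter period.
    cycles = (epoch_a - epoch_b) / p_short
    if abs(cycles - round(cycles)) > epoch_tol:
        return None
    return n
```

A real implementation would also have to allow for the growth of timing errors over the 4 year baseline, which is one reason the significance-based approach of C14 is preferable to fixed tolerances.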

Regarding the mechanism of contamination, the vast 
majority are still caused by direct PRF contamination. 
Direct contamination from saturated stars can extend 
out to many hundreds of arcseconds, due to the large 
wings of the stellar PRF. The column anomaly, first reported 
in C14, caused 74 false positive TCEs, 23 false 
positives were due to CCD cross-talk, and only 2 due to 
antipodal reflection. In Figure 3 we plot the location of 
each false positive TCE and its most likely parent, connected 
by a solid line. TCEs are represented by black 
points, KOIs are represented by green points, EBs found 
by Kepler are represented by red points, and EBs discovered 
from the ground are represented by blue points. 
The Kepler magnitude of each star is shown via a scaled 
point size. Note that most parent-child pairs are so close 
together that the line connecting them is not easily visible 
on the scale of the plot. 

Of the 960 ephemeris match TCEs, 625 are KOIs. The 
remaining 335 TCEs were either identified as the secondary 
eclipse of an EB (and not designated as a KOI), 
or failed triage for some other reason. Rolling band noise 
affects nearby stars with the same period and at the 
same epoch, and thus they can be detected via ephemeris 
matching. Of the 625 false positive KOIs discovered via 
ephemeris matching, 138 have been vetted by TCERT in 
this work, and the remaining 487 are pre-Q16 KOIs with 
P < 50 days. Of the 138, 17 were classified as planet 
candidates by TCERT as a result of this current work, and of 
the 487, 3 were classified as planet candidates as a result 
of previous work. Thus, ephemeris matching provides 
false positive dispositions for an additional 487 KOIs in 
Q1-Q16 vetting, and corrects the disposition of 20 KOIs 
from planet candidate to false positive. 

Figure 3. Distribution of ephemeris matches on the focal plane. Symbol size scales with magnitude, while color represents the catalog in 
which the contaminating source was found. Blue indicates the true transit is from a star in the catalog of Kreiner (2004), and not observed 
by Kepler. Red circles are stars listed in the Kepler Eclipsing Binary catalog, green are KOIs, and black are TCEs. Black lines connect 
false positive matches with the contaminating parent. In most cases parent and child are so close that the connecting line is invisible. 

6. PLANET PARAMETERS 
6.1. Stellar Catalog 

The estimate of the radius of a transiting planet de¬ 
pends on the estimate of the stellar radius. Target se¬ 
lection for Kepler depended on the Kepler Input catalog 
(KIC, Brown et al. 2011). While the KIC can distinguish 


giant and dwarf stars with reasonable success (e.g., Mann 
et al. 2012), the uncertainty in measured stellar surface 
gravity propagates into a large uncertainty in inferred 
planet radius. 

Huber et al. (2014) describe the latest results of an on¬ 
going effort to construct an internally consistent catalog 
of stellar radius drawing from photometry, spectroscopy 
and asteroseismology. We use the catalog described in 
that work to derive planet parameters. It uses the best 
information available to estimate the radius of each in¬ 
dividual star by fitting Dartmouth stellar isochrones to 
observed properties. However, for the majority of the 
planet-candidate sample (~ 80%) the initial estimates of 
atmospheric properties are heavily reliant on the KIC. 
For these stars, the stellar temperatures are updated us¬ 
ing the methodology of Pinsonneault et al. (2012) where 
possible, but the surface gravity estimates are still based 
on the KIC narrow-band photometry. Random uncertainties in KIC-derived log g values have been shown to be
~ 0.3 dex (Molenda-Zakowicz et al. 2011; Bruntt et al.
2012; Huber et al. 2014), and several studies have found
evidence that KIC log g values are systematically overestimated for solar-type dwarfs (Verner et al. 2011; Everett
et al. 2013; Farmer et al. 2013; Bastien et al. 2014), meaning the radii are underestimated. These errors will introduce biases in the derived planet-candidate radii.
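
To illustrate how a surface-gravity error maps onto radius, the following sketch assumes the stellar mass is held fixed, so that g = GM/R^2 and the radius scales as 10^(-log g / 2). This is a simplification of the isochrone fitting described above, not the catalog's procedure, and the function name is hypothetical.

```python
def radius_scale_from_logg_shift(delta_logg):
    """Factor by which the inferred radius changes when log g shifts by
    delta_logg (dex), ASSUMING fixed stellar mass: g = G M / R**2."""
    return 10.0 ** (-0.5 * delta_logg)

# If the catalog log g is too high by 0.3 dex, the true log g is 0.3 dex lower
# and the true stellar radius is larger by this factor.
underestimate_correction = radius_scale_from_logg_shift(-0.3)
print(f"radius correction factor: {underestimate_correction:.2f}")
```

Under this assumption a 0.3 dex error in log g corresponds to a radius error of roughly 40%, and the planet radius inherits the same fractional error because the transit fit constrains the ratio Rp/R*.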

6.2. Planet Parameters 

We re-measure planet parameters for all our KOIs us¬ 
ing the method described by Rowe et al. (2014). This 
provides more realistic uncertainties for our best fit parameters, as well as providing the opportunity to fit the
orphans, i.e., KOIs which were not recovered by the 
pipeline with a longer data set. For previously known 
KOIs parameters are unchanged from those reported in 
Rowe et al. (2015). 

We fit all available quarters of data for each KOI, in¬ 
cluding Q17. The data are cleaned and detrended, using 
an algorithm that protects the regions known to contain 
transits. We fit the lightcurve using the transit model 
introduced in Seager et al. (2003) which describes the 
transit with 5 parameters: period, epoch, impact pa¬ 
rameter, the ratio between planet and stellar radius, and 
stellar density. We use a quadratic limb darkening model 
from Claret et al. (2011) and also include a nuisance pa¬ 
rameter to describe any residual mean out-of-transit flux. 
The planet orbit is assumed to be circular. 
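
As a rough illustration of why stellar density is a natural fit parameter for a circular orbit, the sketch below converts period and mean stellar density into the scaled semi-major axis a/R* via Kepler's third law, then derives an approximate transit duration and depth. The function names are hypothetical, limb darkening is ignored, and this is not the Rowe et al. (2014) fitting code.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def a_over_rstar(period_days, stellar_density):
    """Scaled semi-major axis a/R* from Kepler's third law for a circular
    orbit, neglecting the planet mass. stellar_density is in kg/m^3."""
    period_s = period_days * 86400.0
    return (G * stellar_density * period_s ** 2 / (3.0 * math.pi)) ** (1.0 / 3.0)

def transit_duration_hours(period_days, stellar_density, b, k):
    """First-to-fourth contact duration for a circular orbit.
    b is the impact parameter, k = Rp/R*."""
    ar = a_over_rstar(period_days, stellar_density)
    sin_i = math.sqrt(1.0 - (b / ar) ** 2)
    chord = math.sqrt((1.0 + k) ** 2 - b ** 2)
    return period_days * 24.0 / math.pi * math.asin(chord / (ar * sin_i))

# Earth-Sun analogue: P = 365.25 d, solar mean density ~1410 kg/m^3
ar = a_over_rstar(365.25, 1410.0)
dur = transit_duration_hours(365.25, 1410.0, b=0.0, k=0.00917)
depth = 0.00917 ** 2  # fractional depth without limb darkening
```

For the Earth-Sun analogue this gives a/R* of about 215, a duration of about 13 hours and a depth near 84 parts per million.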

The uncertainties in our best fit parameters are some¬ 
what correlated, so we use a Markov Chain Monte Carlo 
(MCMC) approach to improve our error bars, similar to 
that outlined in Ford (2005). We create 4 Markov chains 
of 10® fits each and construct our posterior distribution 
by discarding the first 20% of each chain. We report the 
median value at the Exoplanet Archive (§ 8), and the 
1σ bounds of the distribution as the uncertainty. Transit depth, duration, and planet radius are not directly
fit, but computed from the model as described in Rowe 
et al. (2014), properly including the uncertainty in stellar 
radius. For many of our false alarm KOIs, the MCMC 
finds no peak in the posterior distribution for period or 
epoch, indicating that the fit to the transit is of poor 
quality. For these KOIs, or others where the MCMC fit
fails to give reasonable values, we report only the period,
epoch and transit duration of the detection. 
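
The posterior summary described above (discard the first 20% of each chain, then report the median and 1σ bounds) can be sketched as follows. The helper names are hypothetical, and taking the 15.87th and 84.13th percentiles as the 1σ bounds is an assumption; the text does not specify how the bounds are computed.

```python
import math

def percentile(sorted_values, q):
    """Linearly interpolated quantile (0 <= q <= 1) of a pre-sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

def summarize_chains(chains, burn_fraction=0.2):
    """Drop the first burn_fraction of each chain, pool the rest, and return
    (median, lower_1sigma, upper_1sigma) for one parameter."""
    pooled = []
    for chain in chains:
        start = int(len(chain) * burn_fraction)
        pooled.extend(chain[start:])
    pooled.sort()
    # 15.87% and 84.13% enclose the central 68.27% (assumed definition of 1 sigma)
    return (percentile(pooled, 0.5),
            percentile(pooled, 0.1587),
            percentile(pooled, 0.8413))

# Toy input: four identical chains holding 0..99 for a single parameter
toy_chains = [list(range(100)) for _ in range(4)]
median, lower, upper = summarize_chains(toy_chains)
```

A posterior with no clear peak, as described for the false alarm KOIs, would show up here as bounds that span most of the allowed parameter range.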

6.2.1. Comparison of MCMC and DV fits 

We compare the values from the MCMC fits with those
from DV’s Marquardt-Levenberg least squares fit (Wu
et al. 2010). Because the MCMC fits are based on a different detrending, it would be a surprise if their values
were identical, but we typically find good agreement between the two approaches (see Figure 4 for an example).
However, there are two regions of parameter space where 
the two approaches tend to disagree. 

First, inferred planet radii tend to disagree at short 
periods. TPS fits and removes coherent sinusoids from 
the data - a process called harmonic removal. This al¬ 
lows detection of transits in strongly varying stars, but 
is known to attenuate the signal of short-period transits, 
often removing them altogether. For short period KOIs
(< 10 days), DV fits should be examined with great care.

Second, both algorithms struggle for v-shaped transits. 
DV constrains the value of the impact parameter, b, to be



Figure 4. Comparison of measured planet radius from DV and 
MCMC fits. For clarity, only a small sample of KOIs are shown. 
The green line shows the one-to-one correspondence. As expected, 
there is good agreement between the two methods although the 
MCMC values are systematically smaller than those found by DV. 
The largest disagreement is typically for KOIs with large fit values 
for impact parameter. 

less than one (corresponding to requiring a small planet 
to pass within 1 stellar radius of the line of sight). The 
MCMC fits allow b to float, allowing the radius of the
planet to grow arbitrarily large while attempting to fit a
grazing transit. We caution that neither approach should 
be trusted for b > 0.9. 

For lower values of b, and for values of Rp/R* < 0.1,
the MCMC estimates of Rp/R* are systematically ≈ 7%
smaller than the values for DV. The source of this difference is under investigation. This bias propagates to
the computed values of transit depth and planet radius.
Note that the difference between DV and MCMC fits is
less than the typical 1σ MCMC uncertainty in the parameter.

The chief advantage of the MCMC approach is the improved uncertainty estimates in the presence of strong
covariance between the fit parameters. DV and MCMC
uncertainties are in good agreement for parameters that
do not covary strongly such as period, epoch, and depth.
However, the DV estimate of the uncertainty in Rp/R*
(which is computed independent of transit depth uncertainty) typically does not agree with the MCMC value.
Rp/R* is strongly correlated with b, and the local linearized uncertainty estimate employed by DV becomes
ill-conditioned (see Table 2 of Carter et al. 2008), resulting in an overestimate of the uncertainty of b. This
overestimate in Δb propagates into an overestimate in
the planet radius uncertainty in DV results (Figure 5).
The MCMC planet radius uncertainty is preferred to the 
DV value reported in the TCE tables at the archive (see 
§ 8) for this pipeline version. 

7. CAVEATS 

While we make every effort to produce a consistent, 
well-vetted catalog of planet candidates, a catalog of this 
Figure 5. Comparison of the reported uncertainty in planet radius from DV versus the MCMC fits. The green line shows the
one-to-one correspondence. The uncertainties reported by DV are 
systematically larger than from MCMC fits. See text for details. 

size inevitably contains unexpected features. In this sec¬ 
tion we list the more important caveats to be aware of. 

• The disposition process uses a philosophy of “in¬ 
nocent until proven guilty”. KOIs are marked as 
planet candidates if there is insufficient or incon¬ 
clusive evidence for failure. We therefore expect 
the set of false positives to be quite pure, but the 
set of planet candidates will be contaminated with 
a set of possible false positives, or objects for which 
few tests were possible. The catalog reliability, or 
the fraction of false positives lurking in the planet 
candidate set, is an active area of research. 

• The cuts described in § 4.2 eliminated a small num¬ 
ber of viable candidates from the sample of TCEs 
to be triaged. Based on the properties of the known 
KOIs in previous catalogs, we estimate % of the 
TCEs removed by this cut would have passed hu¬ 
man triage and vetting had they been subjected to 
it. 

• Most new candidates reported in this catalog were 
detected with a MES < 10. Easier to detect 
transits have already been found with fewer data. 
Unfortunately, many of the metrics developed by 
TCERT begin to lose their diagnostic power in this 
low signal limit, with a corresponding impact on 
catalog reliability. 

• The false alarm population is dominated by TCEs 
with three detected transits and a detected MES 
< 8. Some tens of these objects survive the var¬ 
ious steps of triage and vetting and are labelled 
as candidates. While there are undoubtedly valid 
planets in this regime, random chance suggests a 
significant fraction of the large false alarm popula¬ 
tion is sufficiently transit-like to survive our tests. 


We discuss the implications of this population for 
catalog reliability in § 9.1.

• The SNR reported by the MCMC parameter fits in 
§ 6.2 is correlated with the detection MES, but with 
a large variance. This variance is probably due 
to the different detrending techniques employed by 
the pipeline and the MCMC fits. Many objects 
detected above the imposed threshold of MES > 
7.1 have smaller reported SNR values. While a 
low SNR is a cause for concern for a KOI, it is 
not sufficient reason to label the transit as a false 
alarm. 

• The best fit planet radius is not used to identify false positives. Although objects with radii
> 2 OOi? 0 are probably too large to be planets for 
all but the youngest systems (Burrows et al. 2001), 
our estimate of planet radius depends on our clas¬ 
sification of the host star. In cases where the host 
star is a late-type dwarf misclassified as an evolved 
giant, the planet radius will be dramatically over¬ 
estimated. Given the interest in M stars as planet 
hosts, and the fact that occasional misclassifica- 
tions of cool giants and dwarfs cannot be excluded 
(Brown et al. 2011; Mann et al. 2012), we do not fail 
such objects based on their inferred radius alone. 

• Our planet parameter fits assume the transiting 
body is small and non-luminous. When these as¬ 
sumptions are not true, our fits are not reliable. 

• KOIs where the ingress and egress times are a significant fraction of the total transit duration are
most likely false positives. These v-shaped transits
can be created by EBs for a wide range of inclinations, but only by grazing transits in the case
of planets. However, a v-shaped transit by itself is
not conclusive evidence of a false positive and these
KOIs are marked as planet candidates. In Table 3
we provide a v-shape metric to identify such cases.
The metric is defined as U = 1 − b − Rp/R*, where
b is the impact parameter, and Rp/R* is the ratio
of planet and star radii. KOIs with U < 0 are more
likely to be caused by EBs.

• TPS occasionally identifies EBs at half their true
period; in these cases, their false-positive nature
can be identified by a statistically significant difference in transit depth between the odd and even
numbered transits. The results of this “odd-even”
test are available for each KOI in the DV generated reports available at the exoplanet archive
(§ 8). However, the fits are seeded with the best
fit to all transits, which can be a local minimum in 
the potential space. The depth difference is there¬ 
fore sometimes underestimated, which may cause 
a small number of false positives to slip into the 
planet candidate category. This unintentional bias 
will be removed in the next version of DV. 

• Users should not rely on the odd-even test for periods >90 days. Kepler rotates every quarter (roughly
90 days), so consecutive transits at these periods fall
on different CCDs. Van Eylen et al. (2013) discusses
the various reasons why measured transit depths
change with CCD. In the presence of flux contamination from a brighter star, the differences can be
a significant fraction of the transit depth itself.

• Transits and other variability on bright stars can 
contaminate targets many tens of arcseconds away 
(Coughlin et al. 2014, and § 5.3). The absence of 
the core of the stellar image in the observed pixel 
mask of the contaminated star means the centroid 
tests fail to identify such cases as false positives. 
In cases where the true source is unobserved, and 
the false positive signal is only detected on a single 
target star, ephemeris matching can’t identify the 
target as a false positive. Estimating the number 
of false positives due to this effect is an ongoing 
effort. 

• The uncertainty in planet radius is driven by uncertainty in the stellar radius, which can be surprisingly large. For a 1 M⊙ star, an uncertainty in
log g of ±0.15 dex, achievable with high-resolution
spectroscopy (e.g., Valenti et al. 2005; Ghezzi et al.
2010), translates into an uncertainty in stellar radius of 15-20%. For stars where the best estimate
of the radius comes from the KIC, the error in log g
can be much larger. With asteroseismology, the
stellar radius is measurable to within a few percent
(e.g., Christensen-Dalsgaard et al. 2010; Chaplin
et al. 2013; Gilliland et al. 2013), and uncertainty
in relative planet radius due to stellar activity can
become significant (Czesla et al. 2009).

• Where a target star is unclassified in the input catalog, the parameter values default to solar:
Teff = 5780 K, log g = 4.438 and stellar radius equal to
1 R⊙. There are 96 KOIs with no KIC parameters, of which 13 are labeled as candidate, 7 of which
are newly reported in this paper.

• The reliability of the measured planet radius is re¬ 
duced in crowded fields. Contaminating light from 
nearby stars in the optimal aperture reduces the 
observed transit depth. We attempt to correct for 
this by estimating the contaminating flux using the 
method described by Bryson et al. (2010b). These 
estimates are known to have issues, especially for 
brighter stars. Problems with the crowding calcu¬ 
lation cause the measured transit depth to change 
as a function of quarter. 

• Our planet parameters are computed assuming the 
parent star is single. If the star is a member of a 
binary system, our fits will tend to systematically 
underestimate the planet radius. 

• Our planet parameters are computed assuming the 
planetary orbit is circular. The calculated uncer¬ 
tainties are therefore biased for cases where the or¬ 
bital eccentricity is non-zero. 

• Most transits are equally spaced in time. In multi¬ 
planet systems, gravitational interaction can per¬ 
turb planet orbits leading to irregularly spaced 
transits. Such systems are said to exhibit tran¬ 
sit timing variations (TTVs) and can be used to 
confirm the planetary nature of a KOI. The TPS 


Table 1

TCE and KOI tables at the Exoplanet Archive

Table        TCE Citation              KOI Citation
Q1-Q6        ...                       Batalha et al. (2013)
Q1-Q8        ...                       Burke et al. (2014)
Q1-Q12       Tenenbaum et al. (2013)   Rowe et al. (2015)
Q1-Q16       Tenenbaum et al. (2014)   This work
Cumulative   ...                       ...

Note. — Tables at the Exoplanet Archive and their corresponding citations. TCE tables for the first two catalogs were not published. The cumulative table combines results from all other catalogs.

algorithm assumes equally spaced transits and fails
to find the most extreme TTV cases. The QATS algorithm (Carter et al. 2013) is better tuned to find
such systems, and Mazeh et al. (2013) provides a
recent catalog of such systems detected with Kepler. Other complicated systems such as circumbinary planets (e.g., Doyle et al. 2011) are typically
also not detected.

• The pipeline requires at least 3 detected transits to 
claim a detection. Long period planets with fewer 
than 3 transits are not detected regardless of their 
SNR. Kipping et al. (2014) reports the detection 
of one such event previously identified by eye as a 
single transit event in Batalha et al. (2013). 

• Our vetting procedures ultimately rely on human 
judgment. Despite every care, it is still possible for 
some errors to slip through. 
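
As an illustration of the v-shape caveat listed above, the sketch below evaluates the metric assuming it has the form U = 1 - b - Rp/R*, which is negative when the planet disk never lies entirely on the stellar disk (a grazing geometry). The function name and input values are hypothetical and are not taken from Table 3.

```python
def v_shape_metric(b, rp_over_rstar):
    """U = 1 - b - Rp/R* (assumed form); negative for grazing geometries."""
    return 1.0 - b - rp_over_rstar

# Hypothetical inputs for illustration only
examples = {
    "central transit": v_shape_metric(0.2, 0.1),
    "grazing transit": v_shape_metric(0.95, 0.1),
}
for label, u in examples.items():
    flag = "more likely an EB" if u < 0 else "no v-shape warning"
    print(f"{label}: U = {u:+.2f} ({flag})")
```

A negative value is a warning rather than a disposition, in keeping with the caveat that a v-shaped transit alone does not make a KOI a false positive.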

8. UNDERSTANDING THE TABLES AT THE NASA 
EXOPLANET ARCHIVE 

The best fit stellar and planetary parameters for all 
known KOIs are hosted at the NASA Exoplanet Archive 
(Akeson et al. 2013). We show a summary for the KOIs 
vetted in this paper in Tables 3 & 4. The full tables can 
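be accessed through a browser (http://exoplanetarchive.ipac.caltech.edu) or through the programmers’ interface (http://exoplanetarchive.ipac.caltech.edu/docs/program_interfaces.html). The table design at the archive reflects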
our goal of supporting analysis of multiple pipeline runs 
while serving two distinct community needs: those of 
follow-up observers (e.g., Gautier et al. 2010) who need 
regular updates of new candidates, and researchers in¬ 
terested in population studies, who need a stable set of 
KOIs and best parameters from which to do science. 

For each pipeline run we post a table of TCEs (from 
Q1-Q12 onward) and a table of KOIs (from Q1-Q6 on¬ 
ward). The labels given to each catalog, and the ap¬ 
propriate citations are given in Table 1. The table cor¬ 
responding to this paper is labeled Q1-Q16. The TCE 
tables contain every event recorded by the pipeline, and 
the preliminary planet parameters estimated by DV. The 
ephemerides, fitted planetary parameters, stellar param¬ 
eters, and diagnostics are available for every TCE where 
that information was measured. The TCE tables are 
static and are never updated. 

For each pipeline run, the KOI table is initially pop¬ 
ulated by every new (i.e., triage passing) and federated 

Use the URL http://exoplanetarchive.ipac.caltech.edu 

http://exoplanetarchive.ipac.caltech.edu/docs/program_interfaces.html
KOI. The column “Disposition Using Kepler Data” is initially set to “not dispositioned” to indicate that the KOI
has not yet been vetted. As vetting proceeds, this column 
is filled in with the phrase “candidate” or “false positive”, 
where false positive also encompasses false alarms (i.e., 
KOIs that aren’t due to the transit of one astrophysical 
body across another). Dispositions are subject to change 
as we perform quality assurance on our work. The diag¬ 
nostic reports used by TCERT for triage and vetting are 
accessible from the TCE table and KOI summary pages. 
A companion guide to help interpret these reports is also
available (http://exoplanetarchive.ipac.caltech.edu/docs/TCERTCompanion_q1_q16.pdf).

While changes are possible to the KOI table it is la¬ 
beled as “Active”. When work on the catalog is com¬ 
plete, we lock the table and change the label to “Done”. 
No further changes will be made to a locked table. 

Starting with Q1-Q12 we also include 4 flag columns 
to indicate the reasons a KOI was marked as a false pos¬ 
itive. More than one flag can be set simultaneously. In 
rare cases, a KOI may fail for reasons other than those 
indicated by the flags, in which case no flag is raised. 
The flags indicate if a KOI was determined to be: 

1. “Non-transit like”: A KOI whose light curve is not 
consistent with that of a transiting planet. This 
includes, but is not limited to, instrumental arti¬ 
facts, non-eclipsing variable stars (e.g., heartbeat 
stars, Thompson et al. 2012), and spurious detec¬ 
tions. 

2. “Significant Secondary”: A KOI that is observed 
to have a significant secondary event, meaning that 
the transit event is most likely caused by an EB. 

3. “Centroid offset”: The source of the transit was on 
a nearby star, not the target KOI 

4. “Ephemeris Match Indicates Contamination”: The 
KOI shares the same period and epoch as another 
system and is judged to be a false positive as de¬ 
scribed in § 5.3. 

Planet parameters are initially populated with the best 
fit values from DV and are replaced with parameters from 
the MCMC analysis (§ 6.2) when available. If the MCMC 
fit fails because the KOI is a false alarm, only the period, 
epoch, and duration of the detected transit are reported. 
MCMC is applied to all KOIs, whether they were found
by the pipeline run or not. KOIs shared with the Q1-Q12 table have identical fit parameters because, in both
cases, 17 quarters of data were available at the time the
MCMC fits were performed. The original fit to the KOI
by DV can be found in the corresponding TCE Table by
matching the KOI's Kepler ID and planet number.

Since no table is guaranteed to list all known KOIs at 
any given time, the exoplanet archive also hosts a cumu¬ 
lative KOI table. This table is generated automatically 
by the archive and presents the most recent, reliable in¬ 
formation available from the individual KOI tables, ac¬ 
cording to priority lists indicated on the website. As 
such, the cumulative table provides the best, most re¬ 
cent, planetary parameters and dispositions on all known 

http://exoplanetarchive.ipac.caltech.edu/docs/TCERTCompanion_q1_q16.pdf


KOIs, but the parameters and dispositions are currently
based on inhomogeneous data sets. Depending on which
KOI you are interested in, the planetary fits and the dis¬ 
positions are based on a different amount of data and a 
different version of the pipeline. 

9. DISCUSSION 
9.1. Catalog Reliability 

Although we have begun to replace manual inspec¬ 
tion with objective tests for triage and vetting, we still 
rely heavily on human intervention in making disposi¬ 
tions. Our methodology has improved considerably since 
Borucki et al. (2011a), but any process involving humans 
is necessarily subjective, with different people focusing 
on different features, and the performance of individu¬ 
als varying with time and circumstance. Every KOI is 
looked at by at least two different people for flux vet¬ 
ting, as is every KOI the robovetter marked as needing 
further analysis for centroid vetting. Each person marks 
a KOI as a planet candidate, a false positive, or as an 
ambiguous case needing a third opinion. 

In Figures 6 and 7 we show the numbers of KOIs shared 
by any two people and the fractional agreement in ini¬ 
tial disposition. The broad agreement indicates that our 
vetters are acting in a broadly consistent manner, and 
the disposition of a KOI does not depend strongly on 
who looked at it. Where two judgments disagreed, or 
one marked the KOI as needing further analysis, addi¬ 
tional people examined the KOI to make the final deci¬ 
sion. This ensures the difficult cases get extra scrutiny 
and considerably improves the consistency of the catalog. 

An additional check on our consistency is the 941 ob¬ 
jects vetted in this catalog that were also vetted in pre¬ 
vious catalogs. Of these, we found 49 KOIs whose dispo¬ 
sitions disagreed between this and the previous catalog. 
Of these, 20 were planet candidates in Q1-Q12 that were 
vetted as false positives in Q1-Q16, and 29 were false 
positives that were vetted as planet candidates. We re¬ 
examined each of these in more detail and confirmed the 
Q1-Q16 TCERT dispositions were correct for 36 of them, 
but the remaining 13 should be overturned to agree with 
the pre-existing dispositions. This consistency check sug¬ 
gests that our catalogs are self-consistent at the ~ 98% 
level. However, consistent vetting is not the same thing 
as correct vetting for a given KOI, as alluded to by the 
caveats listed in § 7. 

In Figure 8 we show the cumulative distribution in
log10 P of all KOIs found in the Kepler catalog papers
(where P is the orbital period of the KOI). The upper black histogram shows the KOI distribution, while
the lower green histogram shows only those KOIs vet¬ 
ted as planet candidates. The false positive fraction, i.e., 
the fraction of KOIs not labeled as candidates, is large 
at both short (< 10 day) and long (> 100 day) periods. 
The short period false positive population is mostly com¬ 
posed of eclipsing binaries, as well as a few variable stars 
that slipped through triage. At long periods, the KOIs 
that did not become candidates are mostly instrumen¬ 
tal false alarms. It is notable that the peak in the KOI 
distribution near 372 days is not reproduced in the can¬ 
didate sample, providing confidence that TCERT was 
effective at identifying and rejecting these rolling band 
false alarms. 
Figure 6. Left: Number of KOIs vetted by any given pair of vetters. Blue squares indicate a high overlap in the number of vetted KOIs,
red means a low overlap, and white means that fewer than 20 KOIs were shared by a pair of vetters. For clarity, the color scale is pinned at 
600 KOIs; the most prolific vetter, DaveL looked at over 1400 KOIs. Right: Fractional agreement between any two pairs of vetters. There 
are 3 options (candidate, false positive or ambiguous), so at worst we expect ~30% agreement. The actual agreement rate is far higher, 
indicating that our vetters are acting in a broadly consistent manner. The worst correlations (yellow squares) correspond to vetters with 
small overlap, and indicate a small number of disagreements. 
Figure 7. Same as Figure 6 but for centroid vetting. Use of the robovetter meant that humans analyzed centroid diagnostics for a smaller
set of KOIs. 


For reference, we overplot a power law function propor- 
tional to (straight blue line). If we naively assume 

that planets are uniformly distributed in period, and the 
planet detection rate was driven by the geometric prob¬ 
ability of transit, we would expect the planet candidate 
distribution to follow this trend. (Pepper et al. (2003) 
provides a more detailed prediction of the expected re¬ 
turn from a planet survey.) The candidate distribution 
does follow the trend from 10-100 days then turns up¬ 
wards slightly. This may be because planets are more 
common at P > 100 days, but a more prosaic explanation 
is that the upturn is caused by an excess of false alarm 
KOIs that pass through vetting in this period range. In 
previous catalogs we were able to fit our KOIs with more 
data than was available when they were originally de¬ 
tected, and we used that fact to identify false alarms. 
The loss of the reaction wheels means that approach is 


no longer available. 

The false alarm population is dominated by TCEs with 
3 transits and low detection strength (MES). Artifacts 
with large MES are more effectively identified and re¬ 
moved by TPS. In Figure 9 we plot the fraction of planet 
candidates with 3 transits as a function of MES. The 
majority of 3 transit events have MES<8. If we assume 
for the sake of simplicity that all planet candidates with 
three transits are false alarms, our reliability at MES<8 
is 66%. This number is a rough estimate, but serves as
a reminder that, because TCERT errs on the side of marking
ambiguous cases as candidates, our planet sample is contaminated by design with events not caused by planets.
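
The 66% figure can be reproduced from the counts quoted in the Figure 9 caption (219 candidates at MES < 8, of which 75 have only three detected transits), under the simplifying assumption stated above that every three-transit candidate is a false alarm. The function name is hypothetical.

```python
def reliability_if_all_flagged_false(n_candidates, n_flagged):
    """Fraction of candidates remaining if every flagged object is a false alarm."""
    return 1.0 - n_flagged / n_candidates

# Counts quoted in the Figure 9 caption for MES < 8
low_mes_reliability = reliability_if_all_flagged_false(219, 75)
print(f"{low_mes_reliability:.0%}")
```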

9.2. Multiple-KOI Systems 

With this catalog there are several new multi-KOI systems. Here we give a brief overview and identify differences between this catalog and the catalog of Burke
et al. (2014), which used roughly half the data used in 
this work. 

For this comparison, we select all multi-KOI systems in 
each catalog that do not have any planet pairs with a pe¬ 
riod ratio smaller than 1.1, eliminating putative systems 
that are likely to be dynamically unstable or split multi-planet systems such as Kepler-132 (K00284). In the Q8
catalog there are 2412 unique KOI systems, of which 480
are multi-KOI systems; these comprise 3136 total planet
candidates, with 1204 candidates in multi-KOI systems.
Our catalog increases this yield to 3872 total KOI systems with 608 multi-KOI systems, comprising 4756 total planet candidates with 1492 appearing in multi-KOI
systems. These 128 new multi-KOI systems
include a net gain of 107 two-planet, 14 three-planet, 3 four-planet, and 4 five-planet systems. Figure 10 shows a histogram of the previous and new multi-planet systems as a function of multiplicity.
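
The selection described above, which keeps multi-KOI systems with no pair of periods closer than a ratio of 1.1, can be sketched as follows. The function names and the toy systems are hypothetical; only the 1.1 threshold comes from the text.

```python
def has_close_pair(periods, min_ratio=1.1):
    """True if any two periods in the system have a ratio below min_ratio."""
    ordered = sorted(periods)
    # After sorting, the smallest ratio is always between adjacent periods.
    return any(longer / shorter < min_ratio
               for shorter, longer in zip(ordered, ordered[1:]))

def select_systems(systems, min_ratio=1.1):
    """Keep multi-KOI systems with no period pair closer than min_ratio."""
    return {name: p for name, p in systems.items()
            if len(p) >= 2 and not has_close_pair(p, min_ratio)}

# Hypothetical systems (orbital periods in days), for illustration only
toy = {"A": [5.0, 10.0, 20.0], "B": [6.2, 6.5, 18.0], "C": [12.0]}
kept = select_systems(toy)
```

In this toy input only system A survives: B contains a pair with a period ratio of about 1.05, and C has a single KOI.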



Figure 8. Period distribution of all KOIs found by Kepler (upper 
black line), and all candidates (lower green line, labeled PC). The 
peak in the KOI distribution near 372 days due to rolling band 
noise is eliminated in the candidate sample. The straight blue line 
shows the expected slope of the distribution if detection rates are 
driven by the geometric probability of transit. Both the KOI and 
PC populations show more KOIs at periods > 100 days which is 
likely caused by the population of false alarms that survive triage 
and vetting respectively. 


9.3. Habitable Zone Planets 

The habitable zone (HZ) planets, those that orbit their 
parent stars at distances that allow liquid water to exist 
on their surface, are of particular interest for follow-up, 
validation, and confirmation. The first theoretical lim¬ 
its on the extent of the HZ come from Kasting et al. 
(1993). Confirmed small planets in or near the HZ that 
were originally found in Kepler data include Kepler-69c 
(Barclay et al. 2013), Kepler-62f (Borucki et al. 2013), 
and Kepler-186f (Quintana et al. 2014). 

To help identify potential habitable zone KOIs, we pro¬ 
vide an estimate of equilibrium temperature, Teq, in the 
Figure 9. Distribution of the detection statistic (MES) for all
new planet candidates listed in Table 3 (white histogram), and the
subset of new candidates with only three detected transits (grey).
TCEs with three detected transits and low MES constitute the
majority of the false alarms. At MES < 8 there are 219 candidates,
and 75 have only three detected transits, suggesting the reliability
in this low SNR limit is ~66%.
Figure 10. A histogram of the number of KOI systems as a function of multiplicity. Results from the Burke et al. (2014) catalog
are shown in light gray while results from this work are darker gray. 
Raw counts are shown in the top panel. The bottom panel shows 
the relative contributions to the total of the two catalogs. 

tables at the NASA Exoplanet Archive. We calculate 

Teq = Teff (R*/2a)^{1/2} [f (1 - A_B)]^{1/4}    (2)
Table 2

Table of Small HZ candidates

KOI         Radius (R⊕)   Teq (K)   SNR    Teff (K)   Kp
K02184.02   ...           288       9.2    4893       15.5
K02194.03   ...           239       12.1   6038       13.9
K05068.01   ...           290       15.1   6440       13.1
K05236.01   ...           240       22.5   6241       13.1
K05737.01   ...           254       6.3    5916       13.8
K05805.01   ...           174       14.2   5192       14.5

Note. — Table of strong candidates for rocky
HZ planets. See text for selection criteria, and the Exoplanet Archive for the full set of parameters.


where Teff and R* are the effective temperature and
radius of the parent star, a is the assumed orbital
semi-major axis, f parameterizes thermal circulation from the
day to night sides, and A_B is the Bond albedo. We
assume full thermal circulation (f = 1) and A_B = 0.3.
Kopparapu et al. (2013) suggest using incident stellar
insolation relative to the Earth, S_eff, in preference to Teq.
Values of Teq reported at the archive can be converted
to insolation flux with the formula

S_{eff} = (T_{eq} / T_\oplus)^4    (3)

where T_⊕ is 255 K, the computed equilibrium temperature
of the Earth given our assumptions. Kopparapu
et al. (2013) predict an inner edge to the HZ in the Solar
System of S_eff between 1.015 (for a conservative, cloud
free model) and 1.77 (based on models of early Venus),
corresponding to Teq between 255 and 294 K.
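
The relations in Equations 2 and 3 can be sketched in a few lines of Python. This is an illustration only, not the code used to populate the archive tables. It assumes Teq = Teff (R*/2a)^(1/2) [f (1 - A_B)]^(1/4) and S_eff = (Teq / 255 K)^4. The function names, the choice of units (stellar radius in solar radii, semi-major axis in AU), and the solar radius constant are assumptions introduced here.

```python
import math

# Assumed constant (not from the text): solar radius expressed in AU.
R_SUN_AU = 0.00465047
# Equilibrium temperature of the Earth under the text's assumptions (K).
T_EARTH_EQ_K = 255.0


def equilibrium_temperature(teff_k, rstar_rsun, a_au, f=1.0, bond_albedo=0.3):
    """Teq in kelvin for a planet at semi-major axis a_au.

    Defaults follow the text: full thermal circulation (f = 1) and a
    Bond albedo of 0.3.
    """
    rstar_au = rstar_rsun * R_SUN_AU
    geometry = math.sqrt(rstar_au / (2.0 * a_au))
    return teff_k * geometry * (f * (1.0 - bond_albedo)) ** 0.25


def insolation_from_teq(teq_k, t_earth_k=T_EARTH_EQ_K):
    """Insolation relative to the Earth, S_eff, from Teq."""
    return (teq_k / t_earth_k) ** 4


def teq_from_insolation(s_eff, t_earth_k=T_EARTH_EQ_K):
    """Inverse of insolation_from_teq."""
    return t_earth_k * s_eff ** 0.25
```

For a Sun-like star (Teff = 5772 K, R* = 1 solar radius) and a = 1 AU, `equilibrium_temperature` returns a value within 1 K of 255 K, and `teq_from_insolation(1.77)` returns approximately 294 K, in line with the inner-edge value quoted above.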

Within the HZ, planets small enough to have a solid
surface (rocky planets) are of particular interest. Rogers
(2014) determined the upper bound for the radius of a
rocky planet is 2.0 R⊕ with 68% confidence, based on the
extensive radial velocity follow-up of small Kepler
candidates by Marcy et al. (2014).

In this section, we highlight a few individual candidates
with R < 2 R⊕ and a Teq that places them in or near the
HZ. K05202.01 (R = 1.8 R⊕, Teq = 227 K), K05506.01
(R = 1.6 R⊕, Teq = 230 K), K05856.01 (R = 1.7 R⊕,
Teq = 280 K), and K06151.01 (R = 1.4 R⊕, Teq = 210 K) are
among our lowest SNR detections. We show an example
lightcurve in Figure 11. Although they are detected with
sufficient formal significance, it is challenging to
distinguish genuine transits from other effects in the lightcurve
at this noise level. Less vetting is possible on such
shallow KOIs, and the possibility of misidentification as a
planet candidate is correspondingly higher.

In Table 2 we list our strongest candidates for rocky
HZ planets. In addition to our criteria for radius and
Teq, these KOIs are either detected with MES > 8 or
with > 3 transits. They have measured impact parameters
< 0.8 (to guard against unreliable fits), and the
transit is visible by eye in the PDC lightcurve. This
reduces the risk of artifacts contaminating our sample of
strongest candidates. These are our best candidates for
habitable, rocky worlds, and prime candidates for
confirmation and validation. We note that the estimated
Teq for K05805.01 is colder than the outer limit of the HZ
computed by Kopparapu et al. (2013) based on evidence
that water may have existed on Mars in the past (190 K
given our assumptions).
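
The selection just described can be expressed as a simple filter. The sketch below is illustrative rather than the procedure actually used: the `Koi` record, its field names, and the function name are hypothetical, and the Teq limits are left as required arguments because the text specifies only "in or near the HZ". The by-eye check of the PDC lightcurve is a manual step, represented here by a boolean supplied by a reviewer.

```python
from dataclasses import dataclass


@dataclass
class Koi:
    """Hypothetical record holding the quantities used in the cuts."""
    name: str
    radius_rearth: float      # planet radius in Earth radii
    teq_k: float              # equilibrium temperature in kelvin
    mes: float                # Multiple Event Statistic
    n_transits: int           # number of detected transits
    impact_parameter: float
    visible_by_eye: bool      # result of manual lightcurve inspection


def is_strong_rocky_hz_candidate(koi, teq_min_k, teq_max_k, max_radius=2.0):
    """Apply the radius, Teq, detection, and fit-quality cuts."""
    small_and_temperate = (
        koi.radius_rearth < max_radius
        and teq_min_k <= koi.teq_k <= teq_max_k
    )
    # Either a high detection statistic or more than three transits.
    robust_detection = koi.mes > 8.0 or koi.n_transits > 3
    # High impact parameters are excluded to guard against unreliable fits.
    reliable_fit = koi.impact_parameter < 0.8
    return (
        small_and_temperate
        and robust_detection
        and reliable_fit
        and koi.visible_by_eye
    )
```

A candidate with three transits and MES below 8 fails the detection cut regardless of its size or temperature, which is the population the catalog flags as less reliable.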

Two KOIs from Table 2 are in multi-KOI systems.
Lissauer et al. (2012) argue that almost all candidates in
multiple systems, such as these, are true planets. The
K02184 system contains two KOIs, one in the HZ and
one with a period of 2 days (K02184.01). This system
is the exception to the rule in that K02184.01 is
a false positive due to a background event. The source
of the K02184.02 event is consistent with being on the
target star. The K02194 system has three planet
candidates, all smaller than 2 R⊕. Because the Lissauer et al.
(2012) analysis does not include a treatment of
non-astrophysical false alarms (Rowe et al. 2014), which is
a concern for the outermost candidate, we do not claim
K02194.03 has been statistically validated, although it
has the strongest claim of any KOI in the table to being
due to a transiting planet.

Even if confirmed or validated, further work is required
to establish the status of these KOIs as rocky, habitable
zone planets. These KOIs have large (~25%) uncertainties
in their stellar radii, which correspond to a similar
uncertainty in planet radius, and a ~30% uncertainty
in Teq. With these uncertainties, we cannot say that
our measurements securely identify any of these KOIs
as rocky, HZ candidates. However, given the size of the
sample it is likely that at least one of these
KOIs has a true radius and Teq that meet our criteria.
Follow-up observations are needed to constrain stellar
radii and identify the true properties of these candidates.
Until such follow-up work is complete, these KOIs
represent the best, closest analogs to the Earth known to
date.

10. CONCLUSION 

We describe the latest planet candidate catalog for the
Kepler mission. With four years of data, Kepler is now
sensitive to smaller and longer period planets than
before. We discuss some of the caveats that users should
be aware of when using the catalog, such as our choice to
err on the side of including a KOI as a planet candidate
in the face of uncertain evidence, and the challenges of
vetting extremely deep and shallow transit events.

The false alarm rate increases for small, long period
planets due to additional sources of spurious events in
this regime. Although we eliminate the majority of false
alarms and false positives, some remain in the final
catalog, particularly at low signal to noise.

We highlight a handful of possibly rocky planets in or
near the HZs of their parent stars, including K02194.03,
the third candidate smaller than 2 R⊕ in a multiple
system, and K05805.01, whose Teq suggests it is colder than
the outer HZ limit proposed by Kopparapu et al. (2013).

We make our first steps towards automating the
identification of planet candidates and false positives, which
will help remove some of the subjectivity and human
error of previous releases. The full table of KOIs is
available at the Exoplanet Archive hosted at NExScI.

Figure 11. An example KOI at the detection threshold,
K05202.01. The bottom section shows the folded lightcurve
centered on the transit. The upper sections show the individual events,
vertically offset for clarity. The transit is barely detected, with a
MES of 7.8, and the individual events are scarcely visible. Because
there is no clear reason to mark this KOI as a false positive, it is
marked as a planet candidate.

Funding for this Discovery mission is provided by
NASA's Science Mission Directorate. Some of the data
presented in this paper were obtained from the Mikulski
Archive for Space Telescopes (MAST). STScI is operated
by the Association of Universities for Research in
Astronomy, Inc., under NASA contract NAS5-26555. Support
for MAST for non-HST data is provided by the NASA
Office of Space Science via grant NNX13AC07G and by
other grants and contracts. This research has made use
of the NASA Exoplanet Archive, which is operated by
the California Institute of Technology, under contract
with the National Aeronautics and Space Administration
under the Exoplanet Exploration Program. We would
like to thank the Exoplanet Archive staff for their efforts
in supporting the Kepler pipeline data products.
Funding for the Stellar Astrophysics Centre is provided by
The Danish National Research Foundation (Grant
agreement no.: DNRF106). The research is supported by the
ASTERISK project (ASTERoseismic Investigations with
SONG and Kepler) funded by the European Research
Council (Grant agreement no.: 267864).

APPENDIX 

NASA missions tend to accumulate acronyms. Here 
we provide a summary of the more important ones used 
in this paper for easy reference. 

CAL: Module of the Kepler pipeline that calibrates the 
recorded pixel values. 

DV: Data Validation. Module of the Kepler pipeline 
that fits the transit parameters and provides diagnostics 
used in triage and vetting. 

EB: Eclipsing Binary star. 

HZ: Habitable Zone. Region around a star where a 
planet could have surface temperatures consistent with 
liquid water. 

KIC: Kepler Input Catalog. This is the catalog used 
in target selection, and provides stellar parameters for 
much of the sample. 

KOI: Kepler Object of Interest. A unique identifier of a
transit event. Some KOIs are marked as false positives
to indicate that the transit event is not due to a planet.


MES: Multiple Event Statistic. The signal-to-noise ratio 
of the detection of a TCE by the TPS module of the 
pipeline. 

PA: Photometric Analysis. The pipeline module that
extracts photometry.

PC: Planet Candidate. 

PDC: Presearch Data Conditioning. The pipeline module
that removes instrumental signals from lightcurves.

TCE: Threshold Crossing Event. A periodic set of dips in
a lightcurve that may be due to a transit.

TCERT: TCE Review Team. A committee that reviews 
TCEs and KOIs to identify false alarms and false 
positives. 

TPS: Transiting Planet Search. The module of the pipeline
that searches for transits. Potential transits (TCEs) are
passed to DV to be fit.

TTV: Transit Timing Variations. Irregularly spaced 
transits due to gravitational interaction in multi-planet 
systems. 

Facilities: Kepler
Table 3

New KOIs discovered in 16 Quarters of Data

KOI        Kepler Id  TCE  Status  V-Shape   Flag  Period (Days)  Epoch (BKJD)  Depth (ppm)  Radius (R⊕)
K00099.02  8505215    1    FP      -         0     79.878049      184.770441    -            -
K00129.02  11974540   2    FP      -         0     143.211094     221.755647    -            -
K00238.03  7219825    3    FP      0.598     0     362.997(26)    256.446(45)   282(31)      1.86 +.../-...
K00266.02  7375348    2    PC      0.666     0     47.74360(28)   160.1645(51)  102.7(6.4)   ... +.../-0.63
K00337.02  10545066   2    PC      0.895     0     154.6074(43)   250.520(21)   280(23)      ... +.../-0.13
K00353.03  11566064   3    PC      0.790     0     11.16223(13)   133.523(11)   98.1(9.3)    ... +.../-...
K00365.02  11623629   2    FP      0.743     0     117.7610(22)   175.560(16)   53.9(8.2)    ... +.../-0.03
K00423.02  9478990    2    FP      -143.711  0     360.4776(99)   253.911(20)   604(117)     ... +2456.24/-2393.47
K00492.02  3559935    2    PC      0.645     1     265.296(15)    297.870(24)   608(134)     ... +.../-0.60
K00520.04  8037145    4    PC      0.530     0     51.16579(64)   172.308(11)   305(25)      1.54 +.../-...


Note. — All new KOIs discovered in Q1-Q16 data not found in the earlier Kepler catalogs. The full 
set of fitted parameters can be found at the Exoplanet Archive (§8); we show only a summary set here. 
KOI is a unique identifier assigned to every Kepler Object of Interest. Kepler Id is a unique identifier
assigned to every target star in the KIC (Brown et al. 2011). The TCE number indicates the order in 
which the pipeline found this event around this target. A status of FP (false positive) indicates that we 
believe the KOI is not a bona-fide planet; PC (Planet Candidate) indicates that we have no compelling 
evidence that the signal is not due to a planet. The v-shape statistic is not included at the Exoplanet 
Archive and is described in § 7. The flag column is set for KOIs detected with three transits and MES
< 8. As discussed in § 9.1, we expect lower reliability for this population of candidates. The numbers in
parentheses indicate the 1σ uncertainty in the two least significant digits. For example, 1.23(45) = 1.23
± 0.45. In cases where the MCMC fit failed to converge we report the period and epoch of the detection 
only. Some KOIs, such as K00423.02, have extremely large reported radii. These are either extremely 
V-shaped transits that are difficult to fit with a planet model, or low SNR false alarms where the MCMC 
fits struggle to converge on a sensible value. (This table is available in its entirety in a machine-readable
form in the online journal. A portion is shown here for guidance regarding its form and content.)
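
The parenthetical uncertainty notation used in the table can be unpacked mechanically. The following Python sketch is one possible reading of that convention, not a tool distributed with the catalog; the function name is hypothetical. It assumes that digits in parentheses without a decimal point apply to the last decimal places of the value, as in 1.23(45) = 1.23 ± 0.45, and that an entry written with an explicit decimal point, such as 102.7(6.4), gives the uncertainty directly. The second rule is an assumption made here.

```python
import re
from decimal import Decimal

# value, optional fractional digits, then the uncertainty in parentheses
_PATTERN = re.compile(r"^\s*([+-]?\d+(?:\.(\d+))?)\((\d+(?:\.\d+)?)\)\s*$")


def parse_parenthetical(text):
    """Return (value, uncertainty) as Decimals for strings like '1.23(45)'."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not in value(uncertainty) form: {text!r}")
    value_str, fraction_digits, unc_str = match.groups()
    value = Decimal(value_str)
    if "." in unc_str:
        # Assumed: an explicit decimal point means the uncertainty is literal.
        uncertainty = Decimal(unc_str)
    else:
        # Shift the digits so they line up with the last decimals of the value.
        n_decimals = len(fraction_digits) if fraction_digits else 0
        uncertainty = Decimal(unc_str).scaleb(-n_decimals)
    return value, uncertainty
```

Using `Decimal` rather than floating point keeps the reported digits exact, so `parse_parenthetical("362.997(26)")` yields a value of 362.997 with an uncertainty of 0.026. Entries without parentheses, such as the period and epoch of KOIs whose MCMC fit failed to converge, raise `ValueError` and would need separate handling.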
Table 4

Previously Known KOIs vetted with 16 Quarters of Data

KOI        Kepler Id  TCE  Status  V-Shape  Flag  Period (Days)  Epoch (BKJD)  Depth (ppm)  Radius (R⊕)
K00157.01  6541920    2    PC      0.933    0     13.024928(13)  138.17627(83) 809.2(6.8)   ...
K00157.02  6541920    3    PC      0.956    0     22.687141(25)  148.45544(92) 995.5(8.0)   ... +0.24/-0.20
K00157.03  6541920    1    PC      0.854    0     31.995506(28)  154.16149(72) 1411(12)     ... +0.28/-0.24
K00157.04  6541920    5    PC      0.277    0     46.68580(13)   225.0421(20)  605(11)      2.74 +0.20/-...
K00157.05  6541920    4    PC      0.861    0     118.37838(31)  287.2891(20)  1117(14)     ... +.../-0.22
K00157.06  6541920    6    PC      0.972    0     10.304005(17)  138.5042(14)  320.2(6.1)   ... +0.13/-...
K00298.02  12785320   2    PC      0.180    0     57.38397(29)   170.7285(43)  237(13)      ...
K00536.01  10965008   1    PC      0.418    0     81.16943(32)   178.5932(31)  1233(26)     ...
K00638.01  5113822    1    PC      0.863    0     23.642191(25)  172.58943(85) 1171(11)     ... +1.49/-0.61
K00638.02  5113822    2    PC      0.928    0     67.09330(14)   146.5670(15)  1294(16)     ...


Note. — Previously known KOIs revetted with 16 quarters of data. A known KOI was revetted if 
it, or any other KOI around the same target star, has a period > 50 days. Some additional KOIs also
received further scrutiny, and their new statuses are included here. See Table 3 for a description
of the columns. (This table is available in its entirety in a machine-readable form in the online
journal. A portion is shown here for guidance regarding its form and content.)



